Patatrack integration - GPU beamspot data format and transfer (4/N) #31130

Merged

fwyzard commented Aug 13, 2020

PR description:

Transfer the beamspot data to the GPU, for use by the reconstruction algorithms running on the GPU.

Implement a BeamSpotCUDA transient data format.
Implement the beamspot host-to-device transfer in a dedicated EDProducer, making use of beginStream()-allocated write-combined memory and asynchronous copies for the transfer.
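For readers unfamiliar with the pattern, here is a minimal stand-alone sketch of a host-to-device transfer that stages data in write-combined pinned memory and copies it asynchronously. It is not the code from this PR: the `BeamSpotData` struct, its fields, the values, and the `CUDA_CHECK` macro are hypothetical, and the one-time allocation in `main()` only stands in for what an EDProducer would do in `beginStream()`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the beamspot payload (not the real CMSSW layout).
struct BeamSpotData {
  float x, y, z;
  float sigmaZ;
  float dxdz, dydz;
};

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                \
  do {                                                                  \
    cudaError_t err_ = (call);                                          \
    if (err_ != cudaSuccess) {                                          \
      std::fprintf(stderr, "%s failed: %s\n", #call,                    \
                   cudaGetErrorString(err_));                           \
      std::exit(EXIT_FAILURE);                                          \
    }                                                                   \
  } while (0)

int main() {
  cudaStream_t stream;
  CUDA_CHECK(cudaStreamCreate(&stream));

  // One-time setup (the analogue of beginStream()): allocate a pinned,
  // write-combined host buffer. Write-combined memory is fast to write
  // sequentially from the CPU, but slow to read back on the host.
  BeamSpotData* staging = nullptr;
  CUDA_CHECK(cudaHostAlloc(reinterpret_cast<void**>(&staging),
                           sizeof(BeamSpotData), cudaHostAllocWriteCombined));

  BeamSpotData* onDevice = nullptr;
  CUDA_CHECK(cudaMalloc(reinterpret_cast<void**>(&onDevice),
                        sizeof(BeamSpotData)));

  // Per-event work: fill the staging buffer with writes only...
  const BeamSpotData values{0.01f, 0.02f, 0.5f, 3.5f, 1e-4f, -1e-4f};
  *staging = values;

  // ...and enqueue the copy. The call returns without waiting for the copy;
  // the staging buffer must not be overwritten until the copy has completed.
  CUDA_CHECK(cudaMemcpyAsync(onDevice, staging, sizeof(BeamSpotData),
                             cudaMemcpyHostToDevice, stream));

  // Kernels enqueued on the same stream would see the data without an
  // explicit wait. Here we synchronize and copy back into ordinary memory
  // only to check the round trip.
  BeamSpotData check{};
  CUDA_CHECK(cudaMemcpyAsync(&check, onDevice, sizeof(BeamSpotData),
                             cudaMemcpyDeviceToHost, stream));
  CUDA_CHECK(cudaStreamSynchronize(stream));

  const bool ok = (check.x == values.x) && (check.sigmaZ == values.sigmaZ);

  CUDA_CHECK(cudaFree(onDevice));
  CUDA_CHECK(cudaFreeHost(staging));
  CUDA_CHECK(cudaStreamDestroy(stream));
  return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Allocating the staging buffer once and reusing it avoids paying the cost of a pinned allocation for every event; the trade-off is that the producer has to ensure the previous copy has finished before refilling the buffer.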

PR validation:

Changes in use in the Patatrack releases.

if this PR is a backport please specify the original PR and why you need to backport that PR:

Includes changes from

@cmsbuild cmsbuild added this to the CMSSW_11_2_X milestone Aug 13, 2020

fwyzard commented Aug 13, 2020

assign heterogeneous

@cmsbuild

The code-checks are being triggered in jenkins.


fwyzard commented Aug 13, 2020

@makortel FYI


fwyzard commented Aug 13, 2020

@slava77 @perrotta @jpata this PR shows the approach used for CPU-GPU data transfers for a simple use case.

Comments about the approach, file organisation, naming scheme, etc. are welcome.

Note that this is still likely to evolve in the future; see for example cms-patatrack#272 and #29297.


fwyzard commented Aug 13, 2020

There are no stand-alone tests for this EDProducer, since the only consumers we have are the pixel reconstruction modules running on the GPU.


fwyzard commented Aug 13, 2020

@cmsbuild, please test


fwyzard commented Aug 13, 2020

enable gpu


fwyzard commented Aug 13, 2020

By the way, I'm not sure who is responsible for the beamspot in general... If you know, could you mention them and let them review this PR as well?

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31130/17731

@cmsbuild

New categories assigned: heterogeneous

@makortel,@fwyzard you have been requested to review this Pull request/Issue and eventually sign? Thanks


cmsbuild commented Aug 13, 2020

The tests are being triggered in jenkins.

@cmsbuild

A new Pull Request was created by @fwyzard (Andrea Bocci) for master.

It involves the following packages:

CUDADataFormats/BeamSpot
RecoVertex/BeamSpotProducer

The following packages do not have a category, yet:

CUDADataFormats/BeamSpot
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@perrotta, @makortel, @slava77, @christopheralanwest, @tocheng, @cmsbuild, @tlampen, @jpata, @fwyzard, @pohsun can you please review it and eventually sign? Thanks.
@makortel, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @ebrondol, @tocheng, @mmusich, @mtosi, @dgulhan this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@cmsbuild

+1
Tested at: b7b6f0b
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-8c50dc/8737/summary.html
CMSSW: CMSSW_11_2_X_2020-08-12-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild

Comparison job queued.

@cmsbuild

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato

+1

@cmsbuild cmsbuild merged commit 40aeab8 into cms-sw:master Aug 26, 2020
@mrodozov


fwyzard commented Aug 27, 2020

@mrodozov thanks for letting me know.

Looking at [image], I think the SIGSEGV in the first one is unrelated, but the NotFound ones are indeed due to this.

Let me check with @makortel the best way to fix it.

@fwyzard fwyzard deleted the patatrack_integration_4_N_beamspot branch August 27, 2020 08:52
@makortel

It seems that this PR also caused the gcc10 build to fail, more specifically the dictionary generation for CUDADataFormats/BeamSpot:

>> Building LCG reflex dict from header file src/CUDADataFormats/BeamSpot/src/classes.h
/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/lcg/root/6.20.06-ghbfee/bin/genreflex src/CUDADataFormats/BeamSpot/src/classes.h -s src/CUDADataFormats/BeamSpot/src/classes_def.xml -o tmp/slc7_amd64_gcc10/src/CUDADataFormats/BeamSpot/src/CUDADataFormatsBeamSpot/a/CUDADataFormatsBeamSpot_xr.cc --fail_on_warnings --rootmap=tmp/slc7_amd64_gcc10/src/CUDADataFormats/BeamSpot/src/CUDADataFormatsBeamSpot/a/CUDADataFormatsBeamSpot_xr.rootmap --rootmap-lib=libCUDADataFormatsBeamSpot.so -m DataFormatsCommon_xr_rdict.pcm -m FWCoreMessageLogger_xr_rdict.pcm -m DataFormatsProvenance_xr_rdict.pcm -m DataFormatsStdDictionaries_xr_rdict.pcm -m DataFormatsStdDictionaries_x1r_rdict.pcm -m DataFormatsStdDictionaries_x2r_rdict.pcm -m DataFormatsStdDictionaries_x3r_rdict.pcm -DCMS_DICT_IMPL -D_REENTRANT -DGNUSOURCE -D__STRICT_ANSI__ -DGNU_GCC -D_GNU_SOURCE -DTBB_USE_GLIBCXX_VERSION=100201 -DTBB_SUPPRESS_DEPRECATED_MESSAGES -DBOOST_SPIRIT_THREADSAFE -DPHOENIX_THREADSAFE -DBOOST_MATH_DISABLE_STD_FPCLASSIFY -DBOOST_UUID_RANDOM_PROVIDER_FORCE_POSIX -DCMSSW_GIT_HASH="CMSSW_11_2_X_2020-08-26-2300" -DPROJECT_NAME="CMSSW" -DPROJECT_VERSION="CMSSW_11_2_X_2020-08-26-2300" -I/data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/b9e941603c5b9f7955ab4e5d570c815e/opt/cmssw/slc7_amd64_gcc10/cms/cmssw/CMSSW_11_2_X_2020-08-26-2300/src -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/pcre/8.43-cms/include -isystem/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/boost/1.72.0-ghbfee/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/bz2lib/1.0.6-cms/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/libuuid/2.34-cms/include -isystem/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/lcg/root/6.20.06-ghbfee/include -isystem/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/tbb/2020_U2-ghbfee/include 
-I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/xz/5.2.4-cms/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/zlib/1.2.11-cms/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/fmt/7.0.1/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/md5/1.0.0-cms/include -I/data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc10/external/tinyxml2/6.2.0-cms/include -DCMSSW_REFLEX_DICT
In file included from input_line_8:54:
In file included from /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/b9e941603c5b9f7955ab4e5d570c815e/opt/cmssw/slc7_amd64_gcc10/cms/cmssw/CMSSW_11_2_X_2020-08-26-2300/src/CUDADataFormats/Common/interface/Product.h:6:
In file included from /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/b9e941603c5b9f7955ab4e5d570c815e/opt/cmssw/slc7_amd64_gcc10/cms/cmssw/CMSSW_11_2_X_2020-08-26-2300/src/CUDADataFormats/Common/interface/ProductBase.h:7:
/data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/b9e941603c5b9f7955ab4e5d570c815e/opt/cmssw/slc7_amd64_gcc10/cms/cmssw/CMSSW_11_2_X_2020-08-26-2300/src/HeterogeneousCore/CUDAUtilities/interface/SharedStreamPtr.h:7:10: fatal error: 'cuda_runtime.h' file not found
#include <cuda_runtime.h>
         ^~~~~~~~~~~~~~~~

(full log https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/slc7_amd64_gcc10/CMSSW_11_2_X_2020-08-26-2300/CUDADataFormats/BeamSpot)

If I've understood correctly, we have not enabled CUDA support for gcc10 (CudaGCCSupport unit test fails). I suppose we should protect the BuildFile.xml with <iftool name="cuda-gcc-support">, I can take care of it.


fwyzard commented Aug 27, 2020

> If I've understood correctly, we have not enabled CUDA support for gcc10 (CudaGCCSupport unit test fails).

I think I tried it in a local area, but it failed to build some of our code :-(

> I suppose we should protect the BuildFile.xml with <iftool name="cuda-gcc-support">, I can take care of it.

Please do, thanks!

@silviodonato

> @mrodozov thanks for letting me know.
>
> Looking at [image], I think the SIGSEGV in the first one is unrelated, but the NotFound ones are indeed due to this.
>
> Let me check with @makortel the best way to fix it.

@fwyzard @makortel do you think #31261 should also fix the failing workflows in gcc820? (They seem to be two different problems.)

fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Oct 1, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Oct 2, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Oct 8, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Oct 19, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Oct 20, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Oct 23, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Nov 6, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Nov 16, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard added a commit to cms-patatrack/cmssw that referenced this pull request Nov 27, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Dec 25, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .
fwyzard pushed a commit to cms-patatrack/cmssw that referenced this pull request Dec 29, 2020
Fixing BeamSpotCUDA naming in siPixelRecHitsCUDAPreSplitting to be compliant with changes made in cms-sw#31130 .